PCNN: Projection Convolutional Neural Networks
FIGURE 3.14
We visualize the distribution of the kernel weights of the first convolutional layer of PCNN-22.
The variance increases as the ratio λ, which balances the projection loss and the cross-entropy
loss, decreases. In particular, when λ = 0 (no projection loss), only one group is obtained,
with the kernel weights distributed around 0, which can cause instability during
binarization. In contrast, the two Gaussians obtained with the projection loss (λ > 0) are more
powerful than the single one obtained without it, and thus yield better BNNs, as also
validated in Table 3.2.
curves) converge faster than PCNNs with λ = 0 (yellow curves) once the epoch number
exceeds 150.
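The trade-off described above can be made concrete with a short sketch. This is a minimal NumPy illustration, not the authors' implementation: it assumes the projection loss penalizes the distance between full-precision weights and a scaled binary projection α·sign(w), and that λ simply weights this term against the cross-entropy loss (λ = 0 recovers plain cross-entropy training). The function names and the scale factor α are hypothetical.

```python
import numpy as np

def projection_loss(w, alpha=1.0):
    # Hypothetical sketch: mean squared distance between the
    # full-precision weights w and their binary projection
    # alpha * sign(w).
    return float(np.mean((w - alpha * np.sign(w)) ** 2))

def total_loss(ce_loss, w, lam):
    # lam (the lambda of the figure captions) balances the two terms;
    # lam = 0 means no projection loss at all.
    return ce_loss + lam * projection_loss(w)
```

With weights already lying exactly on ±α, the projection term vanishes and the total loss reduces to the cross-entropy term alone.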
Diversity visualization. In Fig. 3.17, we visualize four channels of the binary kernels $D_i^l$
in the first row, the feature maps produced by $D_i^l$ in the second row, and the corresponding
feature maps after binarization in the third row, for J = 4. This illustrates the
diversity of kernels and feature maps in PCNNs: multiple projection functions can
capture diverse information and thus perform well even with highly compressed models.
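To illustrate how several projection functions can produce several binary kernels $D_i^l$ from one full-precision kernel, here is a hedged NumPy sketch. It assumes, purely for illustration, that each projection i applies its own scale factor α_i to the sign of the kernel, and that feature maps are binarized with the sign function; the actual projections in PCNN are learned, so the names and factors below are hypothetical.

```python
import numpy as np

def binarize_with_projections(w, alphas):
    # Hypothetical sketch: projection i maps the full-precision kernel w
    # to its own binary kernel alpha_i * sign(w); J = len(alphas)
    # projections yield J distinct binary kernels D_i^l.
    return [a * np.sign(w) for a in alphas]

def binarize_feature_map(x):
    # Feature maps are binarized with the sign function, as in the
    # third row of Fig. 3.17.
    return np.sign(x)
```

With J = 4 scale factors, a single kernel gives four binary kernels, each producing its own feature map before and after binarization.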
FIGURE 3.15
With λ fixed to 1e-4, the variance of the kernel weights decreases from the 2nd epoch to
the 200th epoch, which confirms that the projection loss does not affect convergence.
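The diagnostic shown in Fig. 3.15 can be reproduced by logging the variance of the first-layer kernel weights once per epoch. A minimal helper, assuming the weights are available as an array:

```python
import numpy as np

def kernel_weight_variance(weights):
    # Variance of the flattened kernel weights; recording this value
    # each epoch gives a curve like the one in Fig. 3.15.
    return float(np.var(np.asarray(weights).ravel()))
```

A steadily decreasing value over training is consistent with the weights clustering into the two Gaussians discussed for Fig. 3.14.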